Overview

Dataset statistics

Number of variables: 23
Number of observations: 1481915
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.3 GiB
Average record size in memory: 933.0 B
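
The two memory figures can be cross-checked against each other. The sketch below multiplies the reported average record size by the number of observations and converts the result to GiB; it only uses numbers quoted in the dataset statistics, and the variable names are illustrative.

```python
# Figures copied from the dataset statistics.
n_rows = 1_481_915
avg_record_bytes = 933.0

# Total footprint implied by the average record size.
total_bytes = n_rows * avg_record_bytes
total_gib = total_bytes / 2**30  # 1 GiB = 2**30 bytes

print(f"{total_bytes:,.0f} B is about {total_gib:.2f} GiB")
```

Rounded to one decimal place, the result agrees with the reported total size in memory.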

Variable types

Numeric: 10
DateTime: 2
Text: 8
Categorical: 3

Alerts

zip is highly correlated overall with long and 1 other field [High correlation]
lat is highly correlated overall with merch_lat [High correlation]
long is highly correlated overall with zip and 1 other field [High correlation]
merch_lat is highly correlated overall with lat [High correlation]
merch_long is highly correlated overall with zip and 1 other field [High correlation]
is_fraud is highly imbalanced (95.3%) [Imbalance]
amt is highly skewed (γ1 = 43.56282153) [Skewed]
trans_num has unique values [Unique]
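
The γ1 in the skewness alert is a moment-based skewness coefficient. As a minimal sketch, the function below computes the population form g1 = m3 / m2^1.5; whether the profiler applies a sample-size correction on top of this is not stated in the report, so treat the exact estimator as an assumption. The two input lists are made-up illustrations, not data from this report.

```python
def skewness(values):
    """Population moment coefficient of skewness: g1 = m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical inputs: a symmetric sample and one with a single large outlier.
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
right_tailed = [1.0, 1.0, 1.0, 1.0, 100.0]

print(skewness(symmetric))
print(skewness(right_tailed))
```

A symmetric sample gives a coefficient of zero, while a long right tail pushes it positive, which is the pattern the alert flags for amt.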

Reproduction

Analysis started: 2023-10-13 04:34:13.187562
Analysis finished: 2023-10-13 04:37:23.877652
Duration: 3 minutes and 10.69 seconds
Software version: ydata-profiling v4.6.0
Download configuration: config.json
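
The reported duration follows directly from the two timestamps. A small sketch using only the standard library, with the timestamp strings copied from the Reproduction block:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2023-10-13 04:34:13.187562", FMT)
finished = datetime.strptime("2023-10-13 04:37:23.877652", FMT)

# Split the elapsed time into whole minutes and remaining seconds.
elapsed = finished - started
minutes, seconds = divmod(elapsed.total_seconds(), 60)

print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```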

Variables

Unnamed: 0
Real number (ℝ)

Distinct: 1126161
Distinct (%): 76.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 537337.46
Minimum: 0
Maximum: 1296674
Zeros: 2
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 11.3 MiB

Quantile statistics

Minimum: 0
5th percentile: 46297
Q1: 231654.5
Median: 463235
Q3: 833691.5
95th percentile: 1204169.6
Maximum: 1296674
Range: 1296674
Interquartile range (IQR): 602037
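
The range and interquartile range are simple differences of the quantiles listed here. The sketch below recomputes them and, as an extra step that is not part of the report, derives the conventional 1.5 × IQR Tukey fences to show that no value of this column would count as an outlier under that rule.

```python
# Quantiles copied from the table for "Unnamed: 0".
minimum, maximum = 0, 1296674
q1, q3 = 231654.5, 833691.5

value_range = maximum - minimum
iqr = q3 - q1

# Tukey fences: a common convention, not something the report computes.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print(value_range, iqr)
print(lower_fence <= minimum and maximum <= upper_fence)
```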

Descriptive statistics

Standard deviation: 366977.01
Coefficient of variation (CV): 0.68295444
Kurtosis: -0.96196499
Mean: 537337.46
Median Absolute Deviation (MAD): 277853
Skewness: 0.45382814
Sum: 7.9628845 × 10^11
Variance: 1.3467212 × 10^11
Monotonicity: Not monotonic
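
Two of the descriptive statistics are derived from the others: the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. A short check using the rounded values printed for this variable (so agreement is only up to rounding):

```python
# Rounded values copied from the descriptive statistics of "Unnamed: 0".
std = 366977.01
mean = 537337.46

cv = std / mean       # coefficient of variation
variance = std ** 2   # variance is the square of the standard deviation

print(f"CV = {cv:.8f}")
print(f"Variance = {variance:.7e}")
```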
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
461738 2
 
< 0.1%
161354 2
 
< 0.1%
357478 2
 
< 0.1%
117678 2
 
< 0.1%
78809 2
 
< 0.1%
221230 2
 
< 0.1%
43433 2
 
< 0.1%
262895 2
 
< 0.1%
322867 2
 
< 0.1%
213961 2
 
< 0.1%
Other values (1126151) 1481895
> 99.9%
ValueCountFrequency (%)
0 2
< 0.1%
1 2
< 0.1%
2 1
< 0.1%
3 2
< 0.1%
4 2
< 0.1%
5 2
< 0.1%
6 2
< 0.1%
7 2
< 0.1%
8 1
< 0.1%
9 2
< 0.1%
ValueCountFrequency (%)
1296674 1
< 0.1%
1296673 1
< 0.1%
1296672 1
< 0.1%
1296671 1
< 0.1%
1296669 1
< 0.1%
1296668 1
< 0.1%
1296667 1
< 0.1%
1296666 1
< 0.1%
1296665 1
< 0.1%
1296663 1
< 0.1%
Distinct: 1460892
Distinct (%): 98.6%
Missing: 0
Missing (%): 0.0%
Memory size: 11.3 MiB
Minimum: 2019-01-01 00:00:18
Maximum: 2020-12-31 23:59:34
Histogram with fixed size bins (bins=50)

cc_num
Real number (ℝ)

Distinct: 999
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.1749841 × 10^17
Minimum: 6.0416207 × 10^10
Maximum: 4.9923464 × 10^18
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.3 MiB

Quantile statistics

Minimum: 6.0416207 × 10^10
5th percentile: 6.3048488 × 10^11
Q1: 1.8004295 × 10^14
Median: 3.5214173 × 10^15
Q3: 4.6422555 × 10^15
95th percentile: 4.497914 × 10^18
Maximum: 4.9923464 × 10^18
Range: 4.9923463 × 10^18
Interquartile range (IQR): 4.4622125 × 10^15

Descriptive statistics

Standard deviation: 1.309315 × 10^18
Coefficient of variation (CV): 3.1360958
Kurtosis: 6.1729868
Mean: 4.1749841 × 10^17
Median Absolute Deviation (MAD): 3.0764709 × 10^15
Skewness: 2.8506603
Sum: -6.633861 × 10^18
Variance: 1.7143058 × 10^36
Monotonicity: Not monotonic
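
The sum reported for cc_num is negative even though the column has no negative values. One plausible explanation, stated here as an assumption because the report does not say how the column was stored, is that the values were summed as signed 64-bit integers and the total wrapped around. The sketch below simulates that wraparound with plain Python integers; `wrap_int64` is a hypothetical helper, not a library function.

```python
INT64_MAX = 2**63 - 1


def wrap_int64(value):
    """Map a Python int to the value a signed 64-bit integer would hold."""
    return (value + 2**63) % 2**64 - 2**63


# Mean and row count copied from the report; the product approximates the true sum.
n_rows = 1_481_915
mean_cc_num = 4.1749841e17
approx_true_sum = int(mean_cc_num * n_rows)

print(approx_true_sum > INT64_MAX)   # the true sum does not fit in 64 bits
print(wrap_int64(INT64_MAX + 1))     # one past the maximum wraps to the minimum
```

Because the mean is rounded, this cannot reproduce the reported figure exactly; it only shows that the true sum is far outside the signed 64-bit range, so a wrapped total of either sign is possible.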
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
3.672269902 × 10^13  3559  0.2%
6.304249875 × 10^11  3547  0.2%
3.459339645 × 10^14  3545  0.2%
4.642255475 × 10^15  3540  0.2%
4.364010865 × 10^15  3535  0.2%
2.712209726 × 10^15  3530  0.2%
2.131124026 × 10^14  3530  0.2%
4.716561797 × 10^15  3527  0.2%
3.575789282 × 10^15  3526  0.2%
6.011438889 × 10^15  3526  0.2%
Other values (989)  1446550  97.6%

Value  Count  Frequency (%)
6.041620718 × 10^10  1754  0.1%
6.042292873 × 10^10  1769  0.1%
6.042309813 × 10^10  583  < 0.1%
6.042785159 × 10^10  608  < 0.1%
6.048700208 × 10^10  603  < 0.1%
6.04905963 × 10^10  1156  0.1%
6.049559311 × 10^10  599  < 0.1%
5.018029536 × 10^11  1762  0.1%
5.018181333 × 10^11  7  < 0.1%
5.018282048 × 10^11  590  < 0.1%

Value  Count  Frequency (%)
4.992346398 × 10^18  2330  0.2%
4.989847571 × 10^18  1180  0.1%
4.980323468 × 10^18  605  < 0.1%
4.973530368 × 10^18  1150  0.1%
4.958589672 × 10^18  1757  0.1%
4.95682899 × 10^18  2984  0.2%
4.911818931 × 10^18  7  < 0.1%
4.906628656 × 10^18  2922  0.2%
4.897067971 × 10^18  1183  0.1%
4.890424427 × 10^18  1747  0.1%

Distinct: 693
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 113.2 MiB

Length

Max length: 43
Median length: 36
Mean length: 23.128412
Min length: 13

Characters and Unicode

Total characters: 34274340
Distinct characters: 55
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
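
The mean length reported for this text variable is the total character count divided by the number of observations. A quick check using the figures printed in this section and the dataset overview:

```python
# Figures copied from the report.
n_rows = 1_481_915
total_characters = 34_274_340

mean_length = total_characters / n_rows
print(f"Mean length = {mean_length:.6f}")
```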

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: fraud_Haag-Blanda
2nd row: fraud_Pouros-Conroy
3rd row: fraud_Wiza LLC
4th row: fraud_Jast Ltd
5th row: fraud_Pouros-Conroy
ValueCountFrequency (%)
and 541748
 
15.7%
llc 111493
 
3.2%
inc 105105
 
3.0%
sons 83735
 
2.4%
ltd 80863
 
2.3%
plc 75707
 
2.2%
group 57806
 
1.7%
fraud_kutch 11958
 
0.3%
fraud_schaefer 10693
 
0.3%
fraud_streich 10597
 
0.3%
Other values (804) 2364693
68.5%

Most occurring characters

ValueCountFrequency (%)
a 3326512
 
9.7%
r 3080922
 
9.0%
d 2444700
 
7.1%
e 2131357
 
6.2%
u 2123007
 
6.2%
n 2021193
 
5.9%
1972483
 
5.8%
f 1596911
 
4.7%
_ 1481915
 
4.3%
o 1291119
 
3.8%
Other values (45) 12804221
37.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 25936890
75.7%
Uppercase Letter 3882622
 
11.3%
Space Separator 1972483
 
5.8%
Connector Punctuation 1481915
 
4.3%
Dash Punctuation 509193
 
1.5%
Other Punctuation 491237
 
1.4%
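
The category names in this table (Lowercase Letter, Connector Punctuation, Dash Punctuation and so on) correspond to Unicode general categories. As a rough sketch, Python's standard `unicodedata` module returns the two-letter codes for them (`Ll`, `Lu`, `Zs`, `Pc`, `Pd`, `Po`); how the profiler itself classifies characters is not shown in the report, so this only illustrates the idea. The input is the first sample row from this variable.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


# First sample row of the variable.
counts = category_counts("fraud_Haag-Blanda")
for code, count in counts.most_common():
    print(code, count)
```

Here `Ll` is Lowercase Letter, `Lu` is Uppercase Letter, `Pc` is Connector Punctuation (the underscore) and `Pd` is Dash Punctuation (the hyphen).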

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3326512
12.8%
r 3080922
11.9%
d 2444700
9.4%
e 2131357
 
8.2%
u 2123007
 
8.2%
n 2021193
 
7.8%
f 1596911
 
6.2%
o 1291119
 
5.0%
i 1234035
 
4.8%
t 997994
 
3.8%
Other values (15) 5689140
21.9%
Uppercase Letter
ValueCountFrequency (%)
L 544396
14.0%
C 356370
 
9.2%
S 344804
 
8.9%
B 318777
 
8.2%
H 298336
 
7.7%
K 247906
 
6.4%
G 219718
 
5.7%
R 207248
 
5.3%
M 204609
 
5.3%
P 182171
 
4.7%
Other values (15) 958287
24.7%
Other Punctuation
ValueCountFrequency (%)
, 458013
93.2%
' 33224
 
6.8%
Space Separator
ValueCountFrequency (%)
1972483
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1481915
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 509193
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 29819512
87.0%
Common 4454828
 
13.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3326512
 
11.2%
r 3080922
 
10.3%
d 2444700
 
8.2%
e 2131357
 
7.1%
u 2123007
 
7.1%
n 2021193
 
6.8%
f 1596911
 
5.4%
o 1291119
 
4.3%
i 1234035
 
4.1%
t 997994
 
3.3%
Other values (40) 9571762
32.1%
Common
ValueCountFrequency (%)
1972483
44.3%
_ 1481915
33.3%
- 509193
 
11.4%
, 458013
 
10.3%
' 33224
 
0.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 34274340
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 3326512
 
9.7%
r 3080922
 
9.0%
d 2444700
 
7.1%
e 2131357
 
6.2%
u 2123007
 
6.2%
n 2021193
 
5.9%
1972483
 
5.8%
f 1596911
 
4.7%
_ 1481915
 
4.3%
o 1291119
 
3.8%
Other values (45) 12804221
37.4%

category
Categorical

Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 95.4 MiB
gas_transport
150320 
grocery_pos
140903 
home
140289 
shopping_pos
133196 
kids_pets
129324 
Other values (9)
787883 

Length

Max length: 14
Median length: 12
Mean length: 10.526853
Min length: 4

Characters and Unicode

Total characters: 15599902
Distinct characters: 20
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: food_dining
2nd row: shopping_pos
3rd row: misc_pos
4th row: shopping_net
5th row: shopping_pos

Common Values

ValueCountFrequency (%)
gas_transport 150320
10.1%
grocery_pos 140903
9.5%
home 140289
9.5%
shopping_pos 133196
9.0%
kids_pets 129324
8.7%
shopping_net 111281
7.5%
entertainment 107386
7.2%
food_dining 104776
 
7.1%
personal_care 104118
 
7.0%
health_fitness 98276
 
6.6%
Other values (4) 262046
17.7%
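
Each frequency in the Common Values table is the count divided by the total number of observations, rounded to one decimal place. A short sketch for the three most frequent categories, using counts copied from the table:

```python
# Counts copied from the Common Values table; n_rows from the dataset overview.
n_rows = 1_481_915
counts = {
    "gas_transport": 150_320,
    "grocery_pos": 140_903,
    "home": 140_289,
}

# Percentage share of each category, rounded as in the report.
shares = {name: round(100 * count / n_rows, 1) for name, count in counts.items()}
for name, share in shares.items():
    print(f"{name}: {share}%")
```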

Length

Histogram of lengths of the category

Most occurring characters

ValueCountFrequency (%)
s 1633613
10.5%
e 1471362
9.4%
o 1406839
9.0%
n 1364580
8.7%
p 1238069
 
7.9%
t 1230747
 
7.9%
_ 1187846
 
7.6%
r 1048116
 
6.7%
i 952840
 
6.1%
a 760932
 
4.9%
Other values (10) 3304958
21.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 14412056
92.4%
Connector Punctuation 1187846
 
7.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 1633613
11.3%
e 1471362
10.2%
o 1406839
9.8%
n 1364580
9.5%
p 1238069
8.6%
t 1230747
8.5%
r 1048116
7.3%
i 952840
 
6.6%
a 760932
 
5.3%
g 692303
 
4.8%
Other values (9) 2612655
18.1%
Connector Punctuation
ValueCountFrequency (%)
_ 1187846
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 14412056
92.4%
Common 1187846
 
7.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 1633613
11.3%
e 1471362
10.2%
o 1406839
9.8%
n 1364580
9.5%
p 1238069
8.6%
t 1230747
8.5%
r 1048116
7.3%
i 952840
 
6.6%
a 760932
 
5.3%
g 692303
 
4.8%
Other values (9) 2612655
18.1%
Common
ValueCountFrequency (%)
_ 1187846
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 15599902
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 1633613
10.5%
e 1471362
9.4%
o 1406839
9.0%
n 1364580
8.7%
p 1238069
 
7.9%
t 1230747
 
7.9%
_ 1187846
 
7.6%
r 1048116
 
6.7%
i 952840
 
6.1%
a 760932
 
4.9%
Other values (10) 3304958
21.2%

amt
Real number (ℝ)

SKEWED 

Distinct: 55373
Distinct (%): 3.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 70.01506
Minimum: 1
Maximum: 28948.9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.3 MiB

Quantile statistics

Minimum: 1
5th percentile: 2.44
Q1: 9.64
Median: 47.44
Q3: 83.09
95th percentile: 195.34
Maximum: 28948.9
Range: 28947.9
Interquartile range (IQR): 73.45

Descriptive statistics

Standard deviation: 160.63
Coefficient of variation (CV): 2.2942206
Kurtosis: 4687.9012
Mean: 70.01506
Median Absolute Deviation (MAD): 37.45
Skewness: 43.562822
Sum: 1.0375637 × 10^8
Variance: 25801.996
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1.14 642
 
< 0.1%
1.01 610
 
< 0.1%
1.08 610
 
< 0.1%
1.2 600
 
< 0.1%
1.1 599
 
< 0.1%
1.25 592
 
< 0.1%
1.02 590
 
< 0.1%
1.03 589
 
< 0.1%
1.04 588
 
< 0.1%
1.16 586
 
< 0.1%
Other values (55363) 1475909
99.6%
ValueCountFrequency (%)
1 267
< 0.1%
1.01 610
< 0.1%
1.02 590
< 0.1%
1.03 589
< 0.1%
1.04 588
< 0.1%
1.05 572
< 0.1%
1.06 527
< 0.1%
1.07 578
< 0.1%
1.08 610
< 0.1%
1.09 579
< 0.1%
ValueCountFrequency (%)
28948.9 1
< 0.1%
27390.12 1
< 0.1%
27119.77 1
< 0.1%
26544.12 1
< 0.1%
25086.94 1
< 0.1%
22768.11 1
< 0.1%
21437.71 1
< 0.1%
19364.91 1
< 0.1%
17897.24 1
< 0.1%
16837.08 1
< 0.1%

first
Text

Distinct: 355
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 89.1 MiB

Length

Max length: 11
Median length: 9
Mean length: 6.0800188
Min length: 3

Characters and Unicode

Total characters: 9010071
Distinct characters: 49
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Kristen
2nd row: Mary
3rd row: Rebecca
4th row: Adam
5th row: Dawn
ValueCountFrequency (%)
christopher 30432
 
2.1%
robert 24579
 
1.7%
jessica 23499
 
1.6%
david 22943
 
1.5%
james 22928
 
1.5%
michael 22865
 
1.5%
jennifer 19415
 
1.3%
william 18749
 
1.3%
john 18674
 
1.3%
mary 18650
 
1.3%
Other values (345) 1259181
85.0%

Most occurring characters

ValueCountFrequency (%)
a 1150721
 
12.8%
e 984435
 
10.9%
i 707020
 
7.8%
n 702293
 
7.8%
r 693648
 
7.7%
l 443775
 
4.9%
h 394646
 
4.4%
s 371004
 
4.1%
t 356040
 
4.0%
o 307422
 
3.4%
Other values (39) 2899067
32.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7528156
83.6%
Uppercase Letter 1481915
 
16.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1150721
15.3%
e 984435
13.1%
i 707020
9.4%
n 702293
9.3%
r 693648
9.2%
l 443775
 
5.9%
h 394646
 
5.2%
s 371004
 
4.9%
t 356040
 
4.7%
o 307422
 
4.1%
Other values (16) 1417152
18.8%
Uppercase Letter
ValueCountFrequency (%)
J 250298
16.9%
M 165626
11.2%
S 130907
8.8%
A 128963
8.7%
C 121223
8.2%
D 98504
 
6.6%
K 97781
 
6.6%
R 80237
 
5.4%
T 76123
 
5.1%
L 72046
 
4.9%
Other values (13) 260207
17.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 9010071
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1150721
 
12.8%
e 984435
 
10.9%
i 707020
 
7.8%
n 702293
 
7.8%
r 693648
 
7.7%
l 443775
 
4.9%
h 394646
 
4.4%
s 371004
 
4.1%
t 356040
 
4.0%
o 307422
 
3.4%
Other values (39) 2899067
32.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9010071
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1150721
 
12.8%
e 984435
 
10.9%
i 707020
 
7.8%
n 702293
 
7.8%
r 693648
 
7.7%
l 443775
 
4.9%
h 394646
 
4.4%
s 371004
 
4.1%
t 356040
 
4.0%
o 307422
 
3.4%
Other values (39) 2899067
32.2%

last
Text

Distinct: 486
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 89.2 MiB

Length

Max length: 11
Median length: 10
Mean length: 6.1118411
Min length: 2

Characters and Unicode

Total characters: 9057229
Distinct characters: 48
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Allen
2nd row: Wall
3rd row: Obrien
4th row: Stark
5th row: Gray
ValueCountFrequency (%)
smith 32816
 
2.2%
williams 26944
 
1.8%
davis 25196
 
1.7%
johnson 22848
 
1.5%
rodriguez 19926
 
1.3%
martinez 16989
 
1.1%
jones 15911
 
1.1%
lewis 14581
 
1.0%
gonzalez 13464
 
0.9%
miller 13444
 
0.9%
Other values (476) 1279796
86.4%

Most occurring characters

ValueCountFrequency (%)
e 898167
 
9.9%
r 753288
 
8.3%
a 740913
 
8.2%
n 695784
 
7.7%
o 666291
 
7.4%
l 559085
 
6.2%
s 557628
 
6.2%
i 498090
 
5.5%
t 330090
 
3.6%
h 261905
 
2.9%
Other values (38) 3095988
34.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7575314
83.6%
Uppercase Letter 1481915
 
16.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 898167
11.9%
r 753288
9.9%
a 740913
9.8%
n 695784
9.2%
o 666291
8.8%
l 559085
 
7.4%
s 557628
 
7.4%
i 498090
 
6.6%
t 330090
 
4.4%
h 261905
 
3.5%
Other values (15) 1614073
21.3%
Uppercase Letter
ValueCountFrequency (%)
M 181519
12.2%
W 121838
 
8.2%
S 120053
 
8.1%
C 106662
 
7.2%
B 96273
 
6.5%
R 94844
 
6.4%
H 93004
 
6.3%
G 86463
 
5.8%
J 82121
 
5.5%
P 75564
 
5.1%
Other values (13) 423574
28.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 9057229
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 898167
 
9.9%
r 753288
 
8.3%
a 740913
 
8.2%
n 695784
 
7.7%
o 666291
 
7.4%
l 559085
 
6.2%
s 557628
 
6.2%
i 498090
 
5.5%
t 330090
 
3.6%
h 261905
 
2.9%
Other values (38) 3095988
34.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9057229
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 898167
 
9.9%
r 753288
 
8.3%
a 740913
 
8.2%
n 695784
 
7.7%
o 666291
 
7.4%
l 559085
 
6.2%
s 557628
 
6.2%
i 498090
 
5.5%
t 330090
 
3.6%
h 261905
 
2.9%
Other values (38) 3095988
34.2%

gender
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 82.0 MiB
F
811696 
M
670219 

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1481915
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: F
2nd row: F
3rd row: F
4th row: M
5th row: F

Common Values

ValueCountFrequency (%)
F 811696
54.8%
M 670219
45.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
f 811696
54.8%
m 670219
45.2%

Most occurring characters

ValueCountFrequency (%)
F 811696
54.8%
M 670219
45.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1481915
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
F 811696
54.8%
M 670219
45.2%

Most occurring scripts

ValueCountFrequency (%)
Latin 1481915
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
F 811696
54.8%
M 670219
45.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1481915
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
F 811696
54.8%
M 670219
45.2%

street
Text

Distinct: 999
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 112.0 MiB

Length

Max length: 35
Median length: 29
Mean length: 22.231728
Min length: 12

Characters and Unicode

Total characters: 32945531
Distinct characters: 62
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 8619 Lisa Manors Apt. 871
2nd row: 2481 Mills Lock
3rd row: 5619 Mendoza Inlet
4th row: 0912 Mark Fields Apt. 080
5th row: 9486 Joel Common Suite 554
ValueCountFrequency (%)
apt 375057
 
6.4%
suite 349293
 
5.9%
island 26310
 
0.4%
michael 21607
 
0.4%
islands 20496
 
0.3%
common 20482
 
0.3%
station 20455
 
0.3%
david 19914
 
0.3%
brooks 19301
 
0.3%
fields 18702
 
0.3%
Other values (1959) 5002828
84.9%

Most occurring characters

ValueCountFrequency (%)
4412530
 
13.4%
e 2048222
 
6.2%
a 1662290
 
5.0%
i 1480942
 
4.5%
t 1425355
 
4.3%
r 1261206
 
3.8%
n 1219452
 
3.7%
s 1181368
 
3.6%
l 1016532
 
3.1%
o 1000689
 
3.0%
Other values (52) 16236945
49.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 16471821
50.0%
Decimal Number 7997943
24.3%
Space Separator 4412530
 
13.4%
Uppercase Letter 3688180
 
11.2%
Other Punctuation 375057
 
1.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 2048222
12.4%
a 1662290
10.1%
i 1480942
9.0%
t 1425355
8.7%
r 1261206
 
7.7%
n 1219452
 
7.4%
s 1181368
 
7.2%
l 1016532
 
6.2%
o 1000689
 
6.1%
u 701368
 
4.3%
Other values (16) 3474397
21.1%
Uppercase Letter
ValueCountFrequency (%)
S 642045
17.4%
A 482819
13.1%
M 294697
 
8.0%
C 255165
 
6.9%
P 223938
 
6.1%
R 213248
 
5.8%
B 169409
 
4.6%
F 163860
 
4.4%
L 150561
 
4.1%
J 138640
 
3.8%
Other values (14) 953798
25.9%
Decimal Number
ValueCountFrequency (%)
5 855536
10.7%
3 846197
10.6%
2 840260
10.5%
7 803451
10.0%
1 792693
9.9%
8 790925
9.9%
0 774792
9.7%
6 774382
9.7%
4 766890
9.6%
9 752817
9.4%
Space Separator
ValueCountFrequency (%)
4412530
100.0%
Other Punctuation
ValueCountFrequency (%)
. 375057
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 20160001
61.2%
Common 12785530
38.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 2048222
 
10.2%
a 1662290
 
8.2%
i 1480942
 
7.3%
t 1425355
 
7.1%
r 1261206
 
6.3%
n 1219452
 
6.0%
s 1181368
 
5.9%
l 1016532
 
5.0%
o 1000689
 
5.0%
u 701368
 
3.5%
Other values (40) 7162577
35.5%
Common
ValueCountFrequency (%)
4412530
34.5%
5 855536
 
6.7%
3 846197
 
6.6%
2 840260
 
6.6%
7 803451
 
6.3%
1 792693
 
6.2%
8 790925
 
6.2%
0 774792
 
6.1%
6 774382
 
6.1%
4 766890
 
6.0%
Other values (2) 1127874
 
8.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 32945531
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4412530
 
13.4%
e 2048222
 
6.2%
a 1662290
 
5.0%
i 1480942
 
4.5%
t 1425355
 
4.3%
r 1261206
 
3.8%
n 1219452
 
3.7%
s 1181368
 
3.6%
l 1016532
 
3.1%
o 1000689
 
3.0%
Other values (52) 16236945
49.3%

city
Text

Distinct: 906
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 92.8 MiB

Length

Max length: 25
Median length: 21
Mean length: 8.652312
Min length: 3

Characters and Unicode

Total characters: 12821991
Distinct characters: 52
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Lagrange
2nd row: Plainfield
3rd row: Juliette
4th row: Mc Veytown
5th row: Topeka
ValueCountFrequency (%)
city 24620
 
1.3%
west 22253
 
1.2%
north 16480
 
0.9%
saint 16380
 
0.9%
falls 14687
 
0.8%
new 13480
 
0.7%
lake 12928
 
0.7%
mount 12889
 
0.7%
san 11756
 
0.6%
springs 9947
 
0.5%
Other values (929) 1694492
91.6%

Most occurring characters

ValueCountFrequency (%)
e 1245001
 
9.7%
a 1068358
 
8.3%
n 939180
 
7.3%
o 934885
 
7.3%
l 891781
 
7.0%
r 856686
 
6.7%
i 804866
 
6.3%
t 684441
 
5.3%
s 510055
 
4.0%
367997
 
2.9%
Other values (42) 4518741
35.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 10601722
82.7%
Uppercase Letter 1851092
 
14.4%
Space Separator 367997
 
2.9%
Dash Punctuation 1180
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1245001
11.7%
a 1068358
10.1%
n 939180
8.9%
o 934885
8.8%
l 891781
 
8.4%
r 856686
 
8.1%
i 804866
 
7.6%
t 684441
 
6.5%
s 510055
 
4.8%
d 353626
 
3.3%
Other values (15) 2312843
21.8%
Uppercase Letter
ValueCountFrequency (%)
C 179294
 
9.7%
M 169130
 
9.1%
S 155035
 
8.4%
B 152323
 
8.2%
H 132196
 
7.1%
W 108953
 
5.9%
P 105407
 
5.7%
L 99000
 
5.3%
R 90720
 
4.9%
A 85414
 
4.6%
Other values (15) 573620
31.0%
Space Separator
ValueCountFrequency (%)
367997
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1180
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 12452814
97.1%
Common 369177
 
2.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1245001
 
10.0%
a 1068358
 
8.6%
n 939180
 
7.5%
o 934885
 
7.5%
l 891781
 
7.2%
r 856686
 
6.9%
i 804866
 
6.5%
t 684441
 
5.5%
s 510055
 
4.1%
d 353626
 
2.8%
Other values (40) 4163935
33.4%
Common
ValueCountFrequency (%)
367997
99.7%
- 1180
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12821991
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 1245001
 
9.7%
a 1068358
 
8.3%
n 939180
 
7.3%
o 934885
 
7.3%
l 891781
 
7.0%
r 856686
 
6.7%
i 804866
 
6.3%
t 684441
 
5.3%
s 510055
 
4.0%
367997
 
2.9%
Other values (42) 4518741
35.2%

state
Text

Distinct: 51
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 83.4 MiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 2963830
Distinct characters: 24
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: WY
2nd row: NJ
3rd row: GA
4th row: PA
5th row: KS
ValueCountFrequency (%)
tx 108194
 
7.3%
ny 95669
 
6.5%
pa 91463
 
6.2%
ca 64402
 
4.3%
oh 53400
 
3.6%
mi 52566
 
3.5%
il 49669
 
3.4%
fl 48582
 
3.3%
al 46799
 
3.2%
mo 43861
 
3.0%
Other values (41) 827310
55.8%

Most occurring characters

ValueCountFrequency (%)
A 406997
13.7%
N 325088
 
11.0%
M 251295
 
8.5%
I 208184
 
7.0%
T 176141
 
5.9%
L 169150
 
5.7%
O 164723
 
5.6%
C 161112
 
5.4%
Y 150642
 
5.1%
X 108194
 
3.7%
Other values (14) 842304
28.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2963830
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 406997
13.7%
N 325088
 
11.0%
M 251295
 
8.5%
I 208184
 
7.0%
T 176141
 
5.9%
L 169150
 
5.7%
O 164723
 
5.6%
C 161112
 
5.4%
Y 150642
 
5.1%
X 108194
 
3.7%
Other values (14) 842304
28.4%

Most occurring scripts

ValueCountFrequency (%)
Latin 2963830
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 406997
13.7%
N 325088
 
11.0%
M 251295
 
8.5%
I 208184
 
7.0%
T 176141
 
5.9%
L 169150
 
5.7%
O 164723
 
5.6%
C 161112
 
5.4%
Y 150642
 
5.1%
X 108194
 
3.7%
Other values (14) 842304
28.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2963830
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 406997
13.7%
N 325088
 
11.0%
M 251295
 
8.5%
I 208184
 
7.0%
T 176141
 
5.9%
L 169150
 
5.7%
O 164723
 
5.6%
C 161112
 
5.4%
Y 150642
 
5.1%
X 108194
 
3.7%
Other values (14) 842304
28.4%

zip
Real number (ℝ)

HIGH CORRELATION 

Distinct: 985
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 48814.501
Minimum: 1257
Maximum: 99921
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.3 MiB

Quantile statistics

Minimum: 1257
5th percentile: 7208
Q1: 26237
Median: 48174
Q3: 72011
95th percentile: 94569
Maximum: 99921
Range: 98664
Interquartile range (IQR): 45774

Descriptive statistics

Standard deviation: 26879.732
Coefficient of variation (CV): 0.55065054
Kurtosis: -1.0964545
Mean: 48814.501
Median Absolute Deviation (MAD): 23068
Skewness: 0.079134573
Sum: 7.2338942 × 10^10
Variance: 7.2251997 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
73754 4090
 
0.3%
34112 4069
 
0.3%
82514 4062
 
0.3%
48088 4055
 
0.3%
85173 3559
 
0.2%
26292 3547
 
0.2%
21872 3545
 
0.2%
84540 3540
 
0.2%
89512 3535
 
0.2%
29819 3530
 
0.2%
Other values (975) 1444383
97.5%
ValueCountFrequency (%)
1257 2313
0.2%
1330 1194
0.1%
1535 592
 
< 0.1%
1545 1144
 
0.1%
1612 608
 
< 0.1%
1843 2885
0.2%
1844 2296
0.2%
2180 601
 
< 0.1%
2630 2364
0.2%
2908 594
 
< 0.1%
ValueCountFrequency (%)
99921 12
 
< 0.1%
99783 1753
0.1%
99747 11
 
< 0.1%
99746 572
 
< 0.1%
99323 2940
0.2%
99160 3499
0.2%
99116 12
 
< 0.1%
99113 1185
 
0.1%
99033 2882
0.2%
98836 595
 
< 0.1%

lat
Real number (ℝ)

HIGH CORRELATION 

Distinct: 983
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 38.537659
Minimum: 20.0271
Maximum: 66.6933
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.3 MiB

Quantile statistics

Minimum: 20.0271
5th percentile: 29.8826
Q1: 34.6689
Median: 39.3543
Q3: 41.8948
95th percentile: 45.8433
Maximum: 66.6933
Range: 46.6662
Interquartile range (IQR): 7.2259

Descriptive statistics

Standard deviation: 5.0704933
Coefficient of variation (CV): 0.13157243
Kurtosis: 0.78272213
Mean: 38.537659
Median Absolute Deviation (MAD): 3.3597
Skewness: -0.19402462
Sum: 57109535
Variance: 25.709902
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
36.385 4090
 
0.3%
26.1184 4069
 
0.3%
43.0048 4062
 
0.3%
42.5164 4055
 
0.3%
33.2887 3559
 
0.2%
39.1505 3547
 
0.2%
38.4121 3545
 
0.2%
38.9999 3540
 
0.2%
39.5483 3535
 
0.2%
34.0326 3530
 
0.2%
Other values (973) 1444383
97.5%
ValueCountFrequency (%)
20.0271 1737
0.1%
20.0827 1187
 
0.1%
24.6557 2922
0.2%
26.1184 4069
0.3%
26.3304 599
 
< 0.1%
26.3771 574
 
< 0.1%
26.4215 3498
0.2%
26.4722 2914
0.2%
26.529 1782
0.1%
26.6939 1170
 
0.1%
ValueCountFrequency (%)
66.6933 11
 
< 0.1%
65.6899 572
 
< 0.1%
64.7556 1753
0.1%
55.4732 12
 
< 0.1%
48.8878 3499
0.2%
48.8856 2338
0.2%
48.8328 1765
0.1%
48.6669 1180
 
0.1%
48.6031 3505
0.2%
48.4786 2334
0.2%

long
Real number (ℝ)

HIGH CORRELATION 

Distinct: 983
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -90.228007
Minimum: -165.6723
Maximum: -67.9503
Zeros: 0
Zeros (%): 0.0%
Negative: 1481915
Negative (%): 100.0%
Memory size: 11.3 MiB

Quantile statistics

Minimum: -165.6723
5th percentile: -119.0825
Q1: -96.798
Median: -87.4769
Q3: -80.158
95th percentile: -73.5365
Maximum: -67.9503
Range: 97.722
Interquartile range (IQR): 16.64

Descriptive statistics

Standard deviation: 13.745414
Coefficient of variation (CV): -0.15234088
Kurtosis: 1.8315168
Mean: -90.228007
Median Absolute Deviation (MAD): 8.1527
Skewness: -1.145943
Sum: -1.3371024 × 10^8
Variance: 188.93641
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
-98.0727 4090
 
0.3%
-81.7361 4069
 
0.3%
-108.8964 4062
 
0.3%
-82.9832 4055
 
0.3%
-111.0985 3559
 
0.2%
-79.503 3547
 
0.2%
-75.2811 3545
 
0.2%
-82.7243 3540
 
0.2%
-109.615 3540
 
0.2%
-119.7957 3535
 
0.2%
Other values (973) 1444373
97.5%
ValueCountFrequency (%)
-165.6723 1753
0.1%
-156.292 572
 
< 0.1%
-155.488 1187
0.1%
-155.3697 1737
0.1%
-153.994 11
 
< 0.1%
-133.1171 12
 
< 0.1%
-124.4409 1167
0.1%
-124.2174 1740
0.1%
-124.1587 1168
0.1%
-124.1437 1764
0.1%
ValueCountFrequency (%)
-67.9503 2303
0.2%
-68.5565 1188
 
0.1%
-69.2675 592
 
< 0.1%
-69.4828 2360
0.2%
-69.9576 596
 
< 0.1%
-69.9656 3482
0.2%
-70.1031 6
 
< 0.1%
-70.239 1137
 
0.1%
-70.3001 2364
0.2%
-70.3457 1741
0.1%

city_pop
Real number (ℝ)

Distinct: 891
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 88571.477
Minimum: 23
Maximum: 2906700
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.3 MiB

Quantile statistics

Minimum: 23
5th percentile: 139
Q1: 741
Median: 2443
Q3: 20328
95th percentile: 525713
Maximum: 2906700
Range: 2906677
Interquartile range (IQR): 19587

Descriptive statistics

Standard deviation: 301272.82
Coefficient of variation (CV): 3.4014654
Kurtosis: 37.564955
Mean: 88571.477
Median Absolute Deviation (MAD): 2188
Skewness: 5.5900914
Sum: 1.312554 × 10^11
Variance: 9.076531 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
606 | 6352 | 0.4%
1312922 | 5882 | 0.4%
1595797 | 5881 | 0.4%
241 | 5335 | 0.4%
1766 | 5239 | 0.4%
302 | 4712 | 0.3%
198 | 4674 | 0.3%
2135 | 4673 | 0.3%
1126 | 4671 | 0.3%
276002 | 4668 | 0.3%
Other values (881) | 1429828 | 96.5%

Minimum 10 values
Value | Count | Frequency (%)
23 | 2318 | 0.2%
37 | 1169 | 0.1%
43 | 2363 | 0.2%
46 | 3540 | 0.2%
47 | 595 | < 0.1%
49 | 1176 | 0.1%
51 | 1149 | 0.1%
52 | 593 | < 0.1%
53 | 2920 | 0.2%
60 | 1184 | 0.1%

Maximum 10 values
Value | Count | Frequency (%)
2906700 | 4667 | 0.3%
2504700 | 2343 | 0.2%
2383912 | 586 | < 0.1%
1595797 | 5881 | 0.4%
1577385 | 2928 | 0.2%
1526206 | 4071 | 0.3%
1417793 | 8 | < 0.1%
1382480 | 2346 | 0.2%
1312922 | 5882 | 0.4%
1263321 | 4088 | 0.3%

job
Text

Distinct: 497
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 109.2 MiB

Length

Max length: 59
Median length: 38
Mean length: 20.23293
Min length: 3

Characters and Unicode

Total characters: 29983483
Distinct characters: 53
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
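
As a rough illustration of those character properties, the sketch below uses Python's standard `unicodedata` module to tally the general category of every character in one job title taken from this dataset. The helper name `category_counts` is hypothetical, and this is not necessarily how the profiling tool builds its own tables.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count Unicode general-category codes (e.g. 'Ll', 'Lu', 'Zs') in text."""
    return Counter(unicodedata.category(ch) for ch in text)


# One job title from the dataset, used here only as example input.
counts = category_counts("Research scientist (maths)")
for code, n in counts.most_common():
    print(code, n)
```

The two-letter codes correspond to the category names used in the report: `Ll` is Lowercase Letter, `Lu` is Uppercase Letter, `Zs` is Space Separator, `Po` is Other Punctuation, and `Ps`/`Pe` are Open/Close Punctuation.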

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Product/process development scientist
2nd row: Leisure centre manager
3rd row: Theatre manager
4th row: Nutritional therapist
5th row: Secondary school teacher

Words
Value | Count | Frequency (%)
engineer | 150561 | 4.6%
officer | 126893 | 3.9%
manager | 70214 | 2.1%
scientist | 63722 | 1.9%
designer | 59784 | 1.8%
surveyor | 56116 | 1.7%
teacher | 43778 | 1.3%
psychologist | 37628 | 1.1%
research | 33860 | 1.0%
editor | 32822 | 1.0%
Other values (457) | 2616002 | 79.5%

Most occurring characters

Value | Count | Frequency (%)
e | 3203585 | 10.7%
i | 2726816 | 9.1%
r | 2512106 | 8.4%
a | 2074000 | 6.9%
t | 2038492 | 6.8%
n | 2017188 | 6.7%
(space) | 1809465 | 6.0%
o | 1706564 | 5.7%
s | 1651280 | 5.5%
c | 1512921 | 5.0%
Other values (43) | 8731066 | 29.1%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 26046581 | 86.9%
Space Separator | 1809465 | 6.0%
Uppercase Letter | 1565403 | 5.2%
Other Punctuation | 506968 | 1.7%
Close Punctuation | 27533 | 0.1%
Open Punctuation | 27533 | 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 3203585 | 12.3%
i | 2726816 | 10.5%
r | 2512106 | 9.6%
a | 2074000 | 8.0%
t | 2038492 | 7.8%
n | 2017188 | 7.7%
o | 1706564 | 6.6%
s | 1651280 | 6.3%
c | 1512921 | 5.8%
l | 1142898 | 4.4%
Other values (16) | 5460731 | 21.0%

Uppercase Letter
Value | Count | Frequency (%)
C | 179610 | 11.5%
E | 166236 | 10.6%
P | 164114 | 10.5%
S | 156833 | 10.0%
T | 129604 | 8.3%
M | 101668 | 6.5%
A | 100772 | 6.4%
F | 78594 | 5.0%
D | 66218 | 4.2%
R | 63853 | 4.1%
Other values (11) | 357901 | 22.9%

Other Punctuation
Value | Count | Frequency (%)
, | 357196 | 70.5%
/ | 140938 | 27.8%
' | 8834 | 1.7%

Space Separator
Value | Count | Frequency (%)
(space) | 1809465 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 27533 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 27533 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 27611984 | 92.1%
Common | 2371499 | 7.9%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 3203585 | 11.6%
i | 2726816 | 9.9%
r | 2512106 | 9.1%
a | 2074000 | 7.5%
t | 2038492 | 7.4%
n | 2017188 | 7.3%
o | 1706564 | 6.2%
s | 1651280 | 6.0%
c | 1512921 | 5.5%
l | 1142898 | 4.1%
Other values (37) | 7026134 | 25.4%

Common
Value | Count | Frequency (%)
(space) | 1809465 | 76.3%
, | 357196 | 15.1%
/ | 140938 | 5.9%
) | 27533 | 1.2%
( | 27533 | 1.2%
' | 8834 | 0.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 29983483 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
e | 3203585 | 10.7%
i | 2726816 | 9.1%
r | 2512106 | 8.4%
a | 2074000 | 6.9%
t | 2038492 | 6.8%
n | 2017188 | 6.7%
(space) | 1809465 | 6.0%
o | 1706564 | 5.7%
s | 1651280 | 5.5%
c | 1512921 | 5.0%
Other values (43) | 8731066 | 29.1%

dob
Date

Distinct: 984
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 11.3 MiB
Minimum: 1924-10-30 00:00:00
Maximum: 2005-01-29 00:00:00
Histogram with fixed size bins (bins=50)

trans_num
Text

UNIQUE 

Distinct: 1481915
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 125.8 MiB

Length

Max length: 32
Median length: 32
Mean length: 32
Min length: 32

Characters and Unicode

Total characters: 47421280
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1481915
Unique (%): 100.0%

Sample

1st row: fd19c51e0b694609f42034aa3bf1830a
2nd row: 133cc647eb444e9344a5c15f6419ce23
3rd row: dcf293f8901483f86f9627dbd603e0cf
4th row: bf7c695b05563c0259f175783b11811c
5th row: 195277f89533d9a16ff4f3581b629c59

Value | Count | Frequency (%)
fd19c51e0b694609f42034aa3bf1830a | 1 | < 0.1%
195277f89533d9a16ff4f3581b629c59 | 1 | < 0.1%
8ae9ac14b13f6f0f1ab262b01dbf007c | 1 | < 0.1%
288d1ec2511320a8468523201f07ca48 | 1 | < 0.1%
342c50d9c172c7693b729f995be2bd99 | 1 | < 0.1%
4a9b5b066a9c3d0826875fe1381fdf73 | 1 | < 0.1%
c60bc926c338fd86b23171d39a6fe0d2 | 1 | < 0.1%
ed8bca1e004eb58b830d22e5f4ab7d08 | 1 | < 0.1%
e3a32d1c7052529ef76c4d46f40b053d | 1 | < 0.1%
f29d2a7bbf5f57c19118ae671054549d | 1 | < 0.1%
Other values (1481905) | 1481905 | > 99.9%

Most occurring characters

Value | Count | Frequency (%)
4 | 2967212 | 6.3%
9 | 2966257 | 6.3%
7 | 2965584 | 6.3%
1 | 2965402 | 6.3%
3 | 2965160 | 6.3%
2 | 2964826 | 6.3%
d | 2964458 | 6.3%
a | 2963816 | 6.2%
f | 2963653 | 6.2%
5 | 2963292 | 6.2%
Other values (6) | 17771620 | 37.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 29643962 | 62.5%
Lowercase Letter | 17777318 | 37.5%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
4 | 2967212 | 10.0%
9 | 2966257 | 10.0%
7 | 2965584 | 10.0%
1 | 2965402 | 10.0%
3 | 2965160 | 10.0%
2 | 2964826 | 10.0%
5 | 2963292 | 10.0%
0 | 2962230 | 10.0%
8 | 2962041 | 10.0%
6 | 2961958 | 10.0%

Lowercase Letter
Value | Count | Frequency (%)
d | 2964458 | 16.7%
a | 2963816 | 16.7%
f | 2963653 | 16.7%
c | 2963209 | 16.7%
b | 2961300 | 16.7%
e | 2960882 | 16.7%

Most occurring scripts

Value | Count | Frequency (%)
Common | 29643962 | 62.5%
Latin | 17777318 | 37.5%

Most frequent character per script

Common
Value | Count | Frequency (%)
4 | 2967212 | 10.0%
9 | 2966257 | 10.0%
7 | 2965584 | 10.0%
1 | 2965402 | 10.0%
3 | 2965160 | 10.0%
2 | 2964826 | 10.0%
5 | 2963292 | 10.0%
0 | 2962230 | 10.0%
8 | 2962041 | 10.0%
6 | 2961958 | 10.0%

Latin
Value | Count | Frequency (%)
d | 2964458 | 16.7%
a | 2963816 | 16.7%
f | 2963653 | 16.7%
c | 2963209 | 16.7%
b | 2961300 | 16.7%
e | 2960882 | 16.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 47421280 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
4 | 2967212 | 6.3%
9 | 2966257 | 6.3%
7 | 2965584 | 6.3%
1 | 2965402 | 6.3%
3 | 2965160 | 6.3%
2 | 2964826 | 6.3%
d | 2964458 | 6.3%
a | 2963816 | 6.2%
f | 2963653 | 6.2%
5 | 2963292 | 6.2%
Other values (6) | 17771620 | 37.5%

unix_time
Real number (ℝ)

Distinct: 1460914
Distinct (%): 98.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.3586718 × 10^9
Minimum: 1.325376 × 10^9
Maximum: 1.3885344 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.3 MiB

Quantile statistics

Minimum: 1.325376 × 10^9
5-th percentile: 1.3301045 × 10^9
Q1: 1.343012 × 10^9
median: 1.3570821 × 10^9
Q3: 1.3745664 × 10^9
95-th percentile: 1.3867823 × 10^9
Maximum: 1.3885344 × 10^9
Range: 63158356
Interquartile range (IQR): 31554376

Descriptive statistics

Standard deviation: 18192852
Coefficient of variation (CV): 0.013390174
Kurtosis: -1.1995227
Mean: 1.3586718 × 10^9
Median Absolute Deviation (MAD): 15787347
Skewness: -0.019417778
Sum: 2.0134361 × 10^15
Variance: 3.3097986 × 10^14
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
1370177227 | 4 | < 0.1%
1387312599 | 4 | < 0.1%
1355835657 | 3 | < 0.1%
1357998610 | 3 | < 0.1%
1386274418 | 3 | < 0.1%
1354985811 | 3 | < 0.1%
1387062311 | 3 | < 0.1%
1351896039 | 3 | < 0.1%
1354458672 | 3 | < 0.1%
1379203657 | 3 | < 0.1%
Other values (1460904) | 1481883 | > 99.9%

Minimum 10 values
Value | Count | Frequency (%)
1325376018 | 1 | < 0.1%
1325376044 | 1 | < 0.1%
1325376051 | 1 | < 0.1%
1325376076 | 1 | < 0.1%
1325376186 | 1 | < 0.1%
1325376248 | 1 | < 0.1%
1325376282 | 1 | < 0.1%
1325376308 | 1 | < 0.1%
1325376361 | 1 | < 0.1%
1325376383 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
1388534374 | 1 | < 0.1%
1388534364 | 1 | < 0.1%
1388534355 | 1 | < 0.1%
1388534349 | 1 | < 0.1%
1388534347 | 1 | < 0.1%
1388534314 | 1 | < 0.1%
1388534284 | 1 | < 0.1%
1388534270 | 1 | < 0.1%
1388534238 | 1 | < 0.1%
1388534217 | 1 | < 0.1%
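
To make the range of `unix_time` easier to read, the sketch below converts the smallest and largest values listed above into calendar timestamps. It assumes the column holds whole seconds since the Unix epoch in UTC, which the report itself does not state; the helper name `to_utc` is hypothetical.

```python
from datetime import datetime, timezone


def to_utc(seconds: int) -> datetime:
    """Interpret an integer as seconds since 1970-01-01 00:00:00 UTC (assumed unit)."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# Smallest and largest unix_time values from the extreme-value tables.
earliest = to_utc(1325376018)
latest = to_utc(1388534374)

print(earliest.isoformat())
print(latest.isoformat())
```

Under that assumption the column runs from the first seconds of 2012 to the last minute of 2013, a span of about two years that is consistent with the reported range of 63158356 seconds.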

merch_lat
Real number (ℝ)

HIGH CORRELATION 

Distinct: 1418540
Distinct (%): 95.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 38.537131
Minimum: 19.027422
Maximum: 67.510267
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.3 MiB

Quantile statistics

Minimum: 19.027422
5-th percentile: 29.751155
Q1: 34.741212
median: 39.369006
Q3: 41.953798
95-th percentile: 46.004211
Maximum: 67.510267
Range: 48.482845
Interquartile range (IQR): 7.2125855

Descriptive statistics

Standard deviation: 5.104556
Coefficient of variation (CV): 0.13245812
Kurtosis: 0.76465667
Mean: 38.537131
Median Absolute Deviation (MAD): 3.387247
Skewness: -0.1903319
Sum: 57108753
Variance: 26.056492
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
41.204635 | 4 | < 0.1%
39.818513 | 4 | < 0.1%
43.682006 | 4 | < 0.1%
36.545289 | 4 | < 0.1%
39.516582 | 4 | < 0.1%
39.734227 | 4 | < 0.1%
33.995152 | 4 | < 0.1%
38.47822 | 4 | < 0.1%
41.51721 | 4 | < 0.1%
37.810182 | 4 | < 0.1%
Other values (1418530) | 1481875 | > 99.9%

Minimum 10 values
Value | Count | Frequency (%)
19.027422 | 1 | < 0.1%
19.027785 | 1 | < 0.1%
19.027804 | 1 | < 0.1%
19.027849 | 1 | < 0.1%
19.029798 | 1 | < 0.1%
19.031242 | 1 | < 0.1%
19.032277 | 1 | < 0.1%
19.032689 | 1 | < 0.1%
19.033288 | 1 | < 0.1%
19.034282 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
67.510267 | 1 | < 0.1%
67.441518 | 1 | < 0.1%
67.397018 | 1 | < 0.1%
67.188111 | 1 | < 0.1%
67.064277 | 1 | < 0.1%
66.835174 | 1 | < 0.1%
66.682905 | 1 | < 0.1%
66.679297 | 1 | < 0.1%
66.67355 | 1 | < 0.1%
66.67154 | 1 | < 0.1%

merch_long
Real number (ℝ)

HIGH CORRELATION 

Distinct: 1454618
Distinct (%): 98.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -90.228151
Minimum: -166.67157
Maximum: -66.950902
Zeros: 0
Zeros (%): 0.0%
Negative: 1481915
Negative (%): 100.0%
Memory size: 11.3 MiB

Quantile statistics

Minimum: -166.67157
5-th percentile: -119.30615
Q1: -96.903202
median: -87.43976
Q3: -80.245857
95-th percentile: -73.369183
Maximum: -66.950902
Range: 99.720673
Interquartile range (IQR): 16.657345

Descriptive statistics

Standard deviation: 13.756976
Coefficient of variation (CV): -0.15246877
Kurtosis: 1.8252507
Mean: -90.228151
Median Absolute Deviation (MAD): 8.223868
Skewness: -1.1429122
Sum: -1.3371045 × 10^8
Variance: 189.25438
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
-73.900295 | 4 | < 0.1%
-96.511763 | 4 | < 0.1%
-74.618269 | 4 | < 0.1%
-92.521318 | 4 | < 0.1%
-80.893888 | 4 | < 0.1%
-81.888319 | 3 | < 0.1%
-80.285495 | 3 | < 0.1%
-88.524861 | 3 | < 0.1%
-85.971056 | 3 | < 0.1%
-81.746112 | 3 | < 0.1%
Other values (1454608) | 1481880 | > 99.9%

Minimum 10 values
Value | Count | Frequency (%)
-166.671575 | 1 | < 0.1%
-166.671242 | 1 | < 0.1%
-166.670006 | 1 | < 0.1%
-166.66991 | 1 | < 0.1%
-166.669638 | 1 | < 0.1%
-166.666179 | 1 | < 0.1%
-166.664828 | 1 | < 0.1%
-166.661968 | 1 | < 0.1%
-166.658797 | 1 | < 0.1%
-166.657834 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
-66.950902 | 1 | < 0.1%
-66.952026 | 1 | < 0.1%
-66.952352 | 1 | < 0.1%
-66.955602 | 1 | < 0.1%
-66.95654 | 1 | < 0.1%
-66.957364 | 1 | < 0.1%
-66.958659 | 1 | < 0.1%
-66.958751 | 1 | < 0.1%
-66.959178 | 1 | < 0.1%
-66.959498 | 1 | < 0.1%

is_fraud
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 82.0 MiB
0: 1474147
1: 7768

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1481915
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 1474147 | 99.5%
1 | 7768 | 0.5%
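
The two counts above are enough to quantify how lopsided `is_fraud` is. The sketch below computes the exact fraud rate and a normalized-entropy imbalance score; that this score is the definition behind the report's IMBALANCE alert is an assumption, and `imbalance_score` is a hypothetical helper name.

```python
import math

# Class counts copied from the Common Values table.
counts = {"0": 1474147, "1": 7768}
total = sum(counts.values())
fraud_rate = counts["1"] / total


def imbalance_score(class_counts: list[int]) -> float:
    """Return 1 - H / log2(k): 0 for a balanced column, 1 when one class holds every row."""
    k = len(class_counts)
    if k < 2:
        return 0.0  # a single class has no imbalance to measure
    n = sum(class_counts)
    probs = [c / n for c in class_counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(k)


score = imbalance_score(list(counts.values()))
print(f"fraud rate: {fraud_rate:.4%}")
print(f"imbalance score: {score:.1%}")
```

With these counts the score comes to about 95.3%, which agrees with the figure quoted in the overview alert for `is_fraud`. A column this skewed usually calls for stratified sampling and for metrics other than plain accuracy.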

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
0 | 1474147 | 99.5%
1 | 7768 | 0.5%

Most occurring characters

Value | Count | Frequency (%)
0 | 1474147 | 99.5%
1 | 7768 | 0.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1481915 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 1474147 | 99.5%
1 | 7768 | 0.5%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1481915 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 1474147 | 99.5%
1 | 7768 | 0.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1481915 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 1474147 | 99.5%
1 | 7768 | 0.5%
0.5%

Interactions

[Figure: grid of pairwise interaction plots; images not recovered]

Correlations

 | Unnamed: 0 | cc_num | amt | zip | lat | long | city_pop | unix_time | merch_lat | merch_long | category | gender | is_fraud
Unnamed: 0 | 1.000 | 0.002 | 0.000 | 0.000 | -0.000 | -0.001 | -0.001 | 0.172 | -0.000 | -0.001 | 0.001 | 0.000 | 0.015
cc_num | 0.002 | 1.000 | -0.001 | 0.014 | -0.003 | -0.014 | 0.049 | 0.001 | -0.004 | -0.014 | 0.009 | 0.052 | 0.002
amt | 0.000 | -0.001 | 1.000 | 0.002 | 0.013 | -0.001 | -0.024 | -0.001 | 0.013 | -0.001 | 0.019 | 0.000 | 0.000
zip | 0.000 | 0.014 | 0.002 | 1.000 | -0.162 | -0.959 | -0.040 | 0.001 | -0.162 | -0.957 | 0.011 | 0.116 | 0.005
lat | -0.000 | -0.003 | 0.013 | -0.162 | 1.000 | 0.106 | -0.264 | 0.001 | 0.991 | 0.104 | 0.010 | 0.101 | 0.040
long | -0.001 | -0.014 | -0.001 | -0.959 | 0.106 | 1.000 | 0.087 | -0.001 | 0.105 | 0.998 | 0.009 | 0.091 | 0.040
city_pop | -0.001 | 0.049 | -0.024 | -0.040 | -0.264 | 0.087 | 1.000 | -0.003 | -0.263 | 0.086 | 0.014 | 0.089 | 0.002
unix_time | 0.172 | 0.001 | -0.001 | 0.001 | 0.001 | -0.001 | -0.003 | 1.000 | 0.001 | -0.001 | 0.001 | 0.000 | 0.022
merch_lat | -0.000 | -0.004 | 0.013 | -0.162 | 0.991 | 0.105 | -0.263 | 0.001 | 1.000 | 0.104 | 0.011 | 0.102 | 0.040
merch_long | -0.001 | -0.014 | -0.001 | -0.957 | 0.104 | 0.998 | 0.086 | -0.001 | 0.104 | 1.000 | 0.009 | 0.082 | 0.040
category | 0.001 | 0.009 | 0.019 | 0.011 | 0.010 | 0.009 | 0.014 | 0.001 | 0.011 | 0.009 | 1.000 | 0.054 | 0.067
gender | 0.000 | 0.052 | 0.000 | 0.116 | 0.101 | 0.091 | 0.089 | 0.000 | 0.102 | 0.082 | 0.054 | 1.000 | 0.006
is_fraud | 0.015 | 0.002 | 0.000 | 0.005 | 0.040 | 0.040 | 0.002 | 0.022 | 0.040 | 0.040 | 0.067 | 0.006 | 1.000
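
To show how the strongly correlated pairs can be read off the matrix, the sketch below copies the upper-triangle entries for the five geographic columns and keeps those whose absolute value reaches a cut-off. The 0.5 threshold and the helper name `strong_pairs` are assumptions made for illustration, not settings taken from the report.

```python
# Upper-triangle coefficients copied from the correlation matrix above
# (geographic columns only).
corr = {
    ("zip", "lat"): -0.162,
    ("zip", "long"): -0.959,
    ("zip", "merch_lat"): -0.162,
    ("zip", "merch_long"): -0.957,
    ("lat", "long"): 0.106,
    ("lat", "merch_lat"): 0.991,
    ("lat", "merch_long"): 0.104,
    ("long", "merch_lat"): 0.105,
    ("long", "merch_long"): 0.998,
    ("merch_lat", "merch_long"): 0.104,
}

THRESHOLD = 0.5  # assumed cut-off for calling a pair highly correlated


def strong_pairs(matrix, threshold):
    """Return (a, b, r) tuples with |r| >= threshold, strongest first."""
    hits = [(a, b, r) for (a, b), r in matrix.items() if abs(r) >= threshold]
    return sorted(hits, key=lambda item: abs(item[2]), reverse=True)


pairs = strong_pairs(corr, THRESHOLD)
for a, b, r in pairs:
    print(f"{a} ~ {b}: {r:+.3f}")
```

The four pairs that survive the filter are the same ones flagged as HIGH CORRELATION in the report. Because customer and merchant coordinates move almost in lockstep, one of each pair could likely be dropped or replaced by a derived distance feature before modelling.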

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
 | Unnamed: 0 | trans_date_trans_time | cc_num | merchant | category | amt | first | last | gender | street | city | state | zip | lat | long | city_pop | job | dob | trans_num | unix_time | merch_lat | merch_long | is_fraud
0 | 461738 | 2019-07-22 18:27:07 | 6011381817520024 | fraud_Haag-Blanda | food_dining | 21.98 | Kristen | Allen | F | 8619 Lisa Manors Apt. 871 | Lagrange | WY | 82221 | 41.6423 | -104.1974 | 635 | Product/process development scientist | 1973-07-13 | fd19c51e0b694609f42034aa3bf1830a | 1342981627 | 41.554200 | -103.287380 | 0
1 | 1226808 | 2020-05-26 23:20:45 | 180048185037117 | fraud_Pouros-Conroy | shopping_pos | 2.71 | Mary | Wall | F | 2481 Mills Lock | Plainfield | NJ | 7060 | 40.6152 | -74.4150 | 71485 | Leisure centre manager | 1974-07-19 | 133cc647eb444e9344a5c15f6419ce23 | 1369610445 | 41.082326 | -73.784634 | 0
2 | 410879 | 2019-07-05 11:18:22 | 342351256941125 | fraud_Wiza LLC | misc_pos | 3.34 | Rebecca | Obrien | F | 5619 Mendoza Inlet | Juliette | GA | 31046 | 33.1194 | -83.8235 | 3343 | Theatre manager | 1990-06-08 | dcf293f8901483f86f9627dbd603e0cf | 1341487102 | 32.903079 | -83.781837 | 0
3 | 481047 | 2019-07-29 15:53:35 | 6011366578560244 | fraud_Jast Ltd | shopping_net | 9.12 | Adam | Stark | M | 0912 Mark Fields Apt. 080 | Mc Veytown | PA | 17051 | 40.5046 | -77.7186 | 4653 | Nutritional therapist | 1997-07-01 | bf7c695b05563c0259f175783b11811c | 1343577215 | 41.125233 | -78.267574 | 0
4 | 468410 | 2019-07-25 18:11:18 | 3576021480694169 | fraud_Pouros-Conroy | shopping_pos | 4.25 | Dawn | Gray | F | 9486 Joel Common Suite 554 | Topeka | KS | 66618 | 39.1329 | -95.7023 | 163415 | Secondary school teacher | 2004-12-30 | 195277f89533d9a16ff4f3581b629c59 | 1343239878 | 39.968570 | -95.706471 | 0
5 | 373542 | 2019-06-22 03:05:25 | 38947654498698 | fraud_Spinka Inc | grocery_net | 46.52 | Lori | Rodriguez | F | 12087 Michael Light | Creola | OH | 45622 | 39.3543 | -82.5030 | 321 | Copywriter, advertising | 1979-06-24 | df9eaa22784e41ae68f74c250b9b2fd2 | 1340334325 | 38.616542 | -81.620499 | 0
6 | 267059 | 2019-05-12 16:09:16 | 377993105397617 | fraud_Gulgowski LLC | home | 12.17 | Nathan | Martinez | M | 586 Thomas Cliffs | Oconto Falls | WI | 54154 | 44.8755 | -88.1555 | 5548 | Mining engineer | 1975-09-11 | 8ae9ac14b13f6f0f1ab262b01dbf007c | 1336838956 | 45.419141 | -87.647265 | 0
7 | 1045298 | 2020-03-09 15:40:18 | 3596217206093829 | fraud_Terry Ltd | home | 27.05 | Sara | Ramirez | F | 23843 Scott Island | Birmingham | IA | 52535 | 40.8626 | -91.9534 | 888 | Camera operator | 1988-03-25 | 288d1ec2511320a8468523201f07ca48 | 1362843618 | 40.346154 | -91.933066 | 0
8 | 23805 | 2019-01-14 21:19:48 | 4512828414983801773 | fraud_Bahringer Group | health_fitness | 7.19 | Monica | Cohen | F | 864 Reynolds Plains | Uledi | PA | 15484 | 39.8936 | -79.7856 | 328 | Tree surgeon | 1983-07-25 | 342c50d9c172c7693b729f995be2bd99 | 1326575988 | 40.744057 | -80.696765 | 0
9 | 618442 | 2019-09-20 09:01:36 | 36153880429415 | fraud_Flatley Group | misc_pos | 10.91 | Erik | Stevens | M | 84033 Pitts Overpass | Lakeland | FL | 33809 | 28.1762 | -81.9591 | 237282 | Plant breeder/geneticist | 1949-10-13 | 4a9b5b066a9c3d0826875fe1381fdf73 | 1348131696 | 28.956241 | -81.077362 | 0

Last rows
 | Unnamed: 0 | trans_date_trans_time | cc_num | merchant | category | amt | first | last | gender | street | city | state | zip | lat | long | city_pop | job | dob | trans_num | unix_time | merch_lat | merch_long | is_fraud
1481905 | 1279423 | 2020-06-15 05:12:39 | 4800395067176717 | fraud_Lockman Ltd | grocery_pos | 143.57 | Daniel | Owens | M | 88794 Mandy Lodge Apt. 874 | Howells | NE | 68641 | 41.6964 | -96.9858 | 1063 | Research scientist (maths) | 1928-04-02 | a454d53ab7be448248aa17ae3d99f05d | 1371273159 | 41.990784 | -97.431909 | 0
1481906 | 892937 | 2019-12-24 20:34:58 | 30234966027947 | fraud_Morissette PLC | shopping_pos | 5.99 | Matthew | Lambert | M | 7188 Melissa Crest Apt. 151 | New Holstein | WI | 53061 | 43.9446 | -88.0911 | 5196 | Child psychotherapist | 1978-01-22 | b579de399001f0170e700649046caedd | 1356381298 | 43.069894 | -88.627589 | 0
1481907 | 1122515 | 2020-04-12 17:59:36 | 503874407318 | fraud_Johns Inc | entertainment | 11.31 | Andrew | Mcgee | M | 4130 Tiffany Glen Apt. 562 | San Antonio | TX | 78248 | 29.5894 | -98.5201 | 1595797 | Exhibition designer | 1975-12-28 | 7ed7cf5c241b102d0f858ac141cdf8e4 | 1365789576 | 29.590449 | -98.897342 | 0
1481908 | 1187684 | 2020-05-11 09:36:12 | 4661996144291811856 | fraud_Kiehn-Emmerich | grocery_pos | 117.10 | Linda | Park | F | 24607 Charles Mountains | Fenelton | PA | 16034 | 40.8555 | -79.7372 | 2054 | Operations geologist | 1963-08-04 | 6587290429dfe378981726d52784118e | 1368264972 | 41.220313 | -80.544994 | 0
1481909 | 241556 | 2020-09-16 13:12:00 | 180018375329178 | fraud_Conroy Ltd | shopping_pos | 184.69 | Michelle | Woods | F | 952 Joseph Throughway | Munith | MI | 49259 | 42.3703 | -84.2485 | 2523 | Geophysicist/field seismologist | 1988-03-21 | 437d13b6e10cbf3a30888bcf706e0890 | 1379337120 | 42.805996 | -84.762264 | 0
1481910 | 1043289 | 2020-03-08 23:12:43 | 4306630852918 | fraud_Kozey-McDermott | travel | 7.16 | Maureen | Garza | F | 169 Edward Inlet | Saint Louis | MO | 63131 | 38.6171 | -90.4504 | 927396 | Occupational hygienist | 1960-03-12 | 5632230607c0f634db4625203f3d3d34 | 1362784363 | 38.109008 | -89.917944 | 0
1481911 | 412764 | 2019-07-06 03:59:03 | 4225990116481262579 | fraud_Hoppe, Harris and Bednar | entertainment | 4.16 | Brian | Simpson | M | 2711 Duran Pines | Honokaa | HI | 96727 | 20.0827 | -155.4880 | 4878 | Physiotherapist | 1966-12-03 | 111685f1f132266a0cecd2a64108fe89 | 1341547143 | 20.741936 | -155.393629 | 0
1481912 | 75245 | 2019-02-13 20:11:59 | 3521417320836166 | fraud_Turner, Ziemann and Lehner | food_dining | 84.69 | Angela | Hodges | F | 08236 Kim Hill | Indianapolis | IN | 46254 | 39.8490 | -86.2720 | 910148 | Firefighter | 1975-11-30 | 246db647966358be9baa402b13c984d7 | 1329163919 | 40.488363 | -86.228543 | 0
1481913 | 738664 | 2019-11-11 15:23:03 | 6011381817520024 | fraud_Graham and Sons | health_fitness | 28.85 | Kristen | Allen | F | 8619 Lisa Manors Apt. 871 | Lagrange | WY | 82221 | 41.6423 | -104.1974 | 635 | Product/process development scientist | 1973-07-13 | 0d589021b00957e7341efd7a78fc5e7b | 1352647383 | 41.394869 | -104.797962 | 0
1481914 | 526735 | 2020-12-25 18:51:59 | 6534628260579800 | fraud_Jacobi and Sons | shopping_pos | 2.71 | Christine | Harris | F | 29606 Martinez Views Suite 653 | Hinesburg | VT | 5461 | 44.3346 | -73.0980 | 4542 | Claims inspector/assessor | 1998-03-19 | 9eb4470ad9e46f8e5b649f10e462988a | 1387997519 | 43.978527 | -73.593372 | 0